48 research outputs found

Resolving XML Semantic Ambiguity

Author: Gilbert Tekli
Joe Tekli
Nathalie Charbel
Richard Chbeir
Publication date: 31/03/2020

XML semantic-aware processing has become a motivating and important challenge in Web data management, data processing, and information retrieval. Although XML data is semi-structured, it remains prone to lexical ambiguity, and thus requires dedicated semantic analysis and sense disambiguation processes to assign well-defined meaning to XML elements and attributes. This becomes crucial in an array of applications ranging over semantic-aware query rewriting, semantic document clustering and classification, schema matching, as well as blog analysis and event detection in social networks and tweets. Most existing approaches in this context: i) ignore the problem of identifying ambiguous XML nodes, ii) only partially consider their structural relations/context, iii) use syntactic information in processing XML data regardless of the semantics involved, and iv) are static in adopting fixed disambiguation constraints, thus limiting user involvement. In this paper, we provide a new XML Semantic Disambiguation Framework, titled XSDF, designed to address each of the above motivations, taking as input an XML document and a general purpose semantic network, and producing as output a semantically augmented XML tree made of unambiguous semantic concepts. Experiments demonstrate the effectiveness of our approach in comparison with alternative methods. General Terms: Algorithms, Measurement, Performance, Design, Experimentation. Keywords: XML semantic-aware processing, ambiguity degree, sphere neighborhood, XML context vector, semantic network, semantic disambiguation.

CiteSeerX

SemIndex: Semantic-Aware Inverted Index

Author: Al Assad Marc
Chbeir Richard
Luo Yi
Raymundo Ibañez Carlos Arturo
Tekli Joe
Traina Jr Caetano
Traina Agma J. M.
Universidad Peruana de Ciencias Aplicadas (UPC)
Yetongnon Kokou
Publication venue: Springer International Publishing
Publication date: 01/01/2014

This paper focuses on the important problem of semantic-aware search in textual (structured, semi-structured, NoSQL) databases. This problem has emerged as a required extension of the standard containment keyword-based query to meet user needs in textual databases and IR applications. We provide here a new approach, called SemIndex, that extends the standard inverted index by constructing a tightly coupled inverted index graph that combines two main resources: a general purpose semantic network, and a standard inverted index on a collection of textual data. We also provide an extended query model and related processing algorithms with the help of SemIndex. To investigate its effectiveness, we set up experiments to test the performance of SemIndex. Preliminary results have demonstrated the effectiveness, scalability and optimality of our approach. This study is partly funded by: Bourgogne Region program, CNRS, STIC AmSud project Geo-Climate XMine, and LAU grant SOERC-1314T012. Peer review.
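
As a rough illustration of the idea in this abstract (coupling an inverted index with a semantic network so that a keyword query can also reach semantically related terms), here is a minimal Python sketch. It is not the SemIndex implementation: the function names, the toy documents, the toy semantic network, and the `max_hops` parameter are all assumptions made for this example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def semantic_search(term, index, network, max_hops=1):
    """Return ids of documents containing `term` or any term reachable from
    it within `max_hops` edges of the semantic network (hypothetical model)."""
    frontier = {term.lower()}
    seen = set(frontier)
    for _ in range(max_hops):
        # Follow semantic links from the current frontier to unseen terms.
        frontier = {n for t in frontier for n in network.get(t, ())} - seen
        seen |= frontier
    hits = set()
    for t in seen:
        hits |= index.get(t, set())
    return hits


# Toy data, invented for illustration only.
DOCS = {1: "the car stopped", 2: "an automobile engine", 3: "a red bicycle"}
NETWORK = {
    "car": {"automobile", "vehicle"},
    "automobile": {"car"},
    "vehicle": {"car", "bicycle"},
}
INDEX = build_inverted_index(DOCS)
```

With `max_hops=0` the function behaves like a plain containment keyword lookup; raising `max_hops` widens the query through the semantic links, which is one simple way to picture the index and the network being traversed as a single graph.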

HAL-uB

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Repositorio Académico UPC

Almost Linear Semantic XML Keyword Search

Author: Chbeir Richard
Tekli Gilbert
Tekli Joe
Publication venue: ACM
Publication date: 01/11/2021

International audience

HAL Descartes

Hal-Diderot

An Overview on XML Semantic Disambiguation from Unstructured Text to Semi-Structured Data: Background, Applications, and Ongoing Challenges

Author: Joe Tekli
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

Semantic and Structure Based XML Similarity: An integrated Approach

Author: Joe Tekli

Over the last decade, XML has gained growing importance as a major means for information management, and has become indispensable for complex data representation. Due to the unprecedented increase in the use of the XML standard, developing efficient techniques for comparing XML-based documents becomes crucial in information retrieval (IR) research. A range of algorithms for comparing hierarchically structured data, e.g. XML documents, have been proposed in the literature. However, to our knowledge, most of them focus exclusively on comparing documents based on structural features, overlooking the semantics involved. In this paper, we deal with this problem and introduce a combined structural/semantic XML similarity approach. Our method integrates IR semantic similarity assessment in an edit distance algorithm, seeking to amend similarity judgments when comparing XML-based documents. Different from previous works, our approach comprises an original edit distance operation cost model, introducing semantic relatedness of XML element/attribute labels in traditional edit distance computations. A discussion about our similarity method's properties, chiefly symmetry and the triangle inequality, with respect to existing measures in the literature is provided here. A prototype has been developed to evaluate the performance of our approach. Experimental results were noticeable.
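
To make the notion of a semantics-aware edit cost more concrete, the following Python sketch computes a classic edit distance in which the cost of substituting one label for another drops as the labels become more semantically related. It is a simplification and not the paper's cost model: it works on flat sequences of element labels rather than on XML trees, and the `similarity` table, its scores, and the function name are assumptions made for this example.

```python
def semantic_edit_distance(a, b, similarity):
    """Edit distance between two label sequences.

    Insertions and deletions cost 1. Substituting label x by label y costs
    1 - sim(x, y), where sim comes from a hypothetical lookup table with
    scores in [0, 1]; identical labels cost 0 and unknown pairs cost 1.
    """
    def sub_cost(x, y):
        if x == y:
            return 0.0
        sim = similarity.get((x, y), similarity.get((y, x), 0.0))
        return 1.0 - sim

    rows, cols = len(a) + 1, len(b) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = float(i)  # delete all of a[:i]
    for j in range(1, cols):
        dist[0][j] = float(j)  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            dist[i][j] = min(
                dist[i - 1][j] + 1.0,                              # deletion
                dist[i][j - 1] + 1.0,                              # insertion
                dist[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]),  # substitution
            )
    return dist[rows - 1][cols - 1]


# Toy relatedness score, invented for illustration only.
SIM = {("author", "writer"): 0.75}
```

Under these assumptions, `["book", "author"]` and `["book", "writer"]` are at distance 0.25 instead of the purely syntactic 1.0. Because the table is looked up in both orders, the measure stays symmetric; whether the triangle inequality holds depends on the similarity scores supplied.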

CiteSeerX

Minimizing user effort in XML grammar matching

Author: Chbeir Richard
Tekli Joe
Publication venue: 'Elsevier BV'
Publication date: 01/01/2012

International audience. XML grammar matching has found considerable interest recently, due to the growing number of heterogeneous XML documents on the Web, and the need to integrate, search and retrieve XML documents originating from different data sources. In this study, we provide an approach for automatic XML grammar matching and comparison aiming to minimize the amount of user effort required to perform the match task. This requires (i) considering the various characteristics and constraints of XML grammars (in comparison with 'grammar simplifying' approaches), (ii) allowing a flexible combination of different matching criteria (in comparison with static approaches), and (iii) effectively considering the semi-structured nature of XML (in contrast with heuristic methods). To achieve this, we propose an extensible framework based on the concept of tree edit distance as an optimal technique to consider XML structure, integrating different matching criteria to capture all basic XML grammar characteristics, ranging over element semantic and syntactic similarities, cardinality and alternativeness constraints, as well as data-type correspondences and relative ordering. In addition, our framework is flexible, enabling the user to choose mapping cardinality (i.e., 1:1, 1:n, n:1, n:n), in comparison with existing static methods (usually constrained to 1:1). User constraints and feedback are equally considered in order to adjust matching results to the user's perception of correct matches. Experiments on real and synthetic XML grammars demonstrate the effectiveness and efficiency of our matching strategy in identifying mappings, in comparison with alternative methods.

Lebanese American University Repository

Hal-Diderot

Resolving XML Semantic Ambiguity

Author: Charbel Nathalie
Chbeir Richard
Tekli Gilbert
Tekli Joe
Publication venue: OpenProceedings.org, University of Konstanz, University Library
Publication date: 23/03/2015

International audience. XML semantic-aware processing has become a motivating and important challenge in Web data management, data processing, and information retrieval. Although XML data is semi-structured, it remains prone to lexical ambiguity, and thus requires dedicated semantic analysis and sense disambiguation processes to assign well-defined meaning to XML elements and attributes. This becomes crucial in an array of applications ranging over semantic-aware query rewriting, semantic document clustering and classification, schema matching, as well as blog analysis and event detection in social networks and tweets. Most existing approaches in this context: i) ignore the problem of identifying ambiguous XML nodes, ii) only partially consider their structural relations/context, iii) use syntactic information in processing XML data regardless of the semantics involved, and iv) are static in adopting fixed disambiguation constraints, thus limiting user involvement. In this paper, we provide a new XML Semantic Disambiguation Framework, titled XSDF, designed to address each of the above motivations, taking as input an XML document and a general purpose semantic network, and producing as output a semantically augmented XML tree made of unambiguous semantic concepts. Experiments demonstrate the effectiveness of our approach in comparison with alternative methods.

HAL Descartes

Hal-Diderot

Semantic and Structure Based XML Similarity: The XS 3 Prototype

Author: Joe Tekli

Due to the ever-increasing web availability of XML-based data, an efficient approach to compare XML documents becomes crucial in information retrieval. Such comparison of XML documents has applications i

CiteSeerX

XSDF: A System for XML Semantic Disambiguation

Author: Charbel Nathalie
Chbeir Richard
Tekli Gilbert
Tekli Joe
Publication venue: IEEE
Publication date: 08/10/2015

International audience. This paper briefly describes and evaluates XSDF, a new XML Semantic Disambiguation Framework, taking as input an XML document and a general purpose semantic network, and producing as output a semantically augmented XML tree made of unambiguous semantic concepts. Experiments demonstrate the effectiveness of XSDF in comparison with alternative methods.

Crossref

HAL Descartes

Hal-Diderot

Generic metadata representation framework for social-based event detection, description, and linkage

Author: Abebe Minale
Chbeir Richard
Getahun Fekade
Tekli Gilbert
Tekli Joe
Publication venue: Elsevier
Publication date: 01/01/2020

International audience

HAL Descartes

Hal-Diderot